Diversity Can Be Transferred: Output Diversification for White- and Black-box Attacks. Yang Song, Department of Computer Science, Stanford University, Stanford, CA, USA
Adversarial attacks often involve random perturbations of the inputs drawn from uniform or Gaussian distributions, e.g., to initialize optimization-based white-box attacks or to generate update directions in black-box attacks. These simple perturbations, however, can be sub-optimal because they are agnostic to the model being attacked. To improve the efficiency of these attacks, we propose Output Diversified Sampling (ODS), a novel sampling strategy that attempts to maximize diversity in the target model's outputs among the generated samples. While ODS is a gradient-based strategy, the diversity offered by ODS is transferable and can be helpful for both white-box and black-box attacks via surrogate models. Empirically, we demonstrate that ODS significantly improves the performance of existing white-box and black-box attacks. In particular, ODS reduces the number of queries needed for state-of-the-art black-box attacks on ImageNet by a factor of two.
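The idea of sampling in output space can be sketched concretely. The following toy illustration stands in a hypothetical linear-softmax model for the target network (`ods_direction`, `W`, and `x` are our own names, not the paper's): draw a random weighting w over the classes and perturb the input along the gradient of w·f(x), so the perturbation diversifies the model's outputs rather than its inputs.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def ods_direction(x, W, rng):
    """Sketch of an output-diversified direction for the toy model
    f(x) = softmax(W @ x). Draw w ~ Uniform(-1, 1)^C and return the
    normalized input gradient of w^T f(x), which moves f(x) in a
    random direction of *output* space."""
    C = W.shape[0]
    w = rng.uniform(-1.0, 1.0, size=C)
    p = softmax(W @ x)
    # Jacobian of softmax(W x) w.r.t. x is (diag(p) - p p^T) W, so
    # grad_x w^T f(x) = W^T (diag(p) - p p^T) w.
    g = W.T @ ((np.diag(p) - np.outer(p, p)) @ w)
    return g / (np.linalg.norm(g) + 1e-12)
```

For a real network the same gradient would come from automatic differentiation; the point is only that the random draw happens in output space.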
Deep learning approaches to surgical video segmentation and object detection: A Scoping Review
Kamtam, Devanish N., Shrager, Joseph B., Malla, Satya Deepya, Lin, Nicole, Cardona, Juan J., Kim, Jake J., Hu, Clarence
Introduction: Computer vision (CV) has had a transformative impact in biomedical fields such as radiology, dermatology, and pathology. Its real-world adoption in surgical applications, however, remains limited. We review the current state-of-the-art performance of deep learning (DL)-based CV models for segmentation and object detection of anatomical structures in videos obtained during surgical procedures. Methods: We conducted a scoping review of studies on semantic segmentation and object detection of anatomical structures published between 2014 and 2024 in three major databases: PubMed, Embase, and IEEE Xplore. The primary objective was to evaluate the state-of-the-art performance of semantic segmentation in surgical videos. Secondary objectives included examining DL models, progress toward clinical applications, and the specific challenges of segmenting organs/tissues in surgical videos. Results: We identified 58 relevant published studies. These focused predominantly on procedures from general surgery [20 (34.4%)], colorectal surgery [9 (15.5%)], and neurosurgery [8 (13.8%)]. Cholecystectomy [14 (24.1%)] and low anterior rectal resection [5 (8.6%)] were the most common procedures addressed. Semantic segmentation [47 (81%)] was the primary CV task. U-Net [14 (24.1%)] and DeepLab [13 (22.4%)] were the most widely used models. Larger organs such as the liver (Dice score: 0.88) were segmented more accurately than smaller structures such as nerves (Dice score: 0.49). Models demonstrated real-time inference potential, ranging from 5 to 298 frames per second (fps). Conclusion: This review highlights the significant progress made in DL-based semantic segmentation for surgical videos with real-time applicability, particularly for larger organs. Addressing challenges with smaller structures, data availability, and generalizability remains crucial for future advancements.
Simple, Scalable and Effective Clustering via One-Dimensional Projections. Monika Henzinger, Institute of Science and Technology Austria (ISTA); Stanford University, Stanford, CA
Clustering is a fundamental problem in unsupervised machine learning with many applications in data analysis. Popular clustering algorithms such as Lloyd's algorithm and k-means++ can take Ω(ndk) time when clustering n points in a d-dimensional space (represented by an n × d matrix X) into k clusters. In applications with moderate to large k, the multiplicative k factor can become very expensive. We introduce a simple randomized clustering algorithm that provably runs in expected time O(nnz(X) + n log n) for arbitrary k. Here nnz(X) is the total number of non-zero entries in the input dataset X, which is upper bounded by nd and can be significantly smaller for sparse datasets.
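To see why the k factor disappears, consider a toy version of the idea (`one_dim_cluster` is our own name; the paper's actual algorithm and its guarantees differ): project all points onto a single random direction, sort the resulting one-dimensional values, and cut at the k−1 widest gaps, giving k contiguous clusters. The projection costs O(nnz(X)) and the sort O(n log n), independent of k.

```python
import numpy as np

def one_dim_cluster(X, k, rng):
    """Toy sketch of clustering via one random 1-D projection:
    sort the projected values and split at the k-1 largest gaps."""
    n, d = X.shape
    if k == 1:
        return np.zeros(n, dtype=int)
    proj = X @ rng.normal(size=d)                # one value per point
    order = np.argsort(proj)
    gaps = np.diff(proj[order])                  # n-1 consecutive gaps
    cuts = np.sort(np.argsort(gaps)[-(k - 1):])  # positions of widest gaps
    labels = np.empty(n, dtype=int)
    for c, chunk in enumerate(np.split(order, cuts + 1)):
        labels[chunk] = c
    return labels
```

This largest-gap heuristic is only illustrative; any 1-D clustering step with the same running time fits the template.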
Noether's Learning Dynamics: Role of Symmetry Breaking in Neural Networks. Physics & Informatics Laboratories, NTT Research, Inc., Sunnyvale, CA, USA; Stanford University, Stanford, CA, USA
In nature, symmetry governs regularities, while symmetry breaking brings texture. In artificial neural networks, symmetry has been a central design principle to efficiently capture regularities in the world, but the role of symmetry breaking is not well understood. Here, we develop a theoretical framework to study the geometry of learning dynamics in neural networks, and reveal a key mechanism of explicit symmetry breaking behind the efficiency and stability of modern neural networks. To build this understanding, we model the discrete learning dynamics of gradient descent using a continuous-time Lagrangian formulation, in which the learning rule corresponds to the kinetic energy and the loss function corresponds to the potential energy. Then, we identify kinetic symmetry breaking (KSB), the condition when the kinetic energy explicitly breaks the symmetry of the potential function. We generalize Noether's theorem known in physics to take into account KSB and derive the resulting motion of the Noether charge: Noether's Learning Dynamics (NLD). Finally, we apply NLD to neural networks with normalization layers and reveal how KSB introduces a mechanism of implicit adaptive optimization, establishing an analogy between learning dynamics induced by normalization layers and RMSProp. Overall, through the lens of Lagrangian mechanics, we have established a theoretical foundation to discover geometric design principles for the learning dynamics of neural networks.
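Schematically, and in generic notation of our own choosing (the paper's precise statement may differ), the construction reads:

```latex
% Lagrangian model of learning: kinetic term T from the learning rule,
% potential term U from the loss.
L(\theta, \dot\theta) = T(\dot\theta) - U(\theta).
% One-parameter transformation that leaves the potential invariant:
\theta \;\mapsto\; \theta + \epsilon\, \eta(\theta),
\qquad \left.\frac{\partial U}{\partial \epsilon}\right|_{\epsilon=0} = 0.
% Noether charge associated with the symmetry:
Q = \Big\langle \frac{\partial L}{\partial \dot\theta},\; \eta(\theta) \Big\rangle.
% Under kinetic symmetry breaking (T not invariant), the charge is no
% longer conserved; on solutions of the Euler--Lagrange equations,
\frac{dQ}{dt} = \left.\frac{\partial T}{\partial \epsilon}\right|_{\epsilon=0}.
```

When T is also invariant, the right-hand side vanishes and the usual conservation law dQ/dt = 0 is recovered; kinetic symmetry breaking makes it nonzero, and the resulting drift of the charge is what the abstract calls Noether's Learning Dynamics.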
Adaptive Learning of Rank-One Models for Efficient Pairwise Sequence Alignment. Microsoft Research New England, Cambridge, MA; Department of Electrical Engineering, Stanford University, Stanford, CA
Pairwise alignment of DNA sequencing data is a ubiquitous task in bioinformatics and typically represents a heavy computational burden. State-of-the-art approaches to speed up this task use hashing to identify short segments (k-mers) that are shared by pairs of reads, which can then be used to estimate alignment scores. However, when the number of reads is large, accurately estimating alignment scores for all pairs is still very costly. Moreover, in practice, one is only interested in identifying pairs of reads with large alignment scores. In this work, we propose a new approach to pairwise alignment estimation based on two key new ingredients. The first ingredient is to cast the problem of pairwise alignment estimation under a general framework of rank-one crowdsourcing models, where the workers' responses correspond to k-mer hash collisions. These models can be accurately solved via a spectral decomposition of the response matrix. The second ingredient is to utilise a multi-armed bandit algorithm to adaptively refine this spectral estimator only for read pairs that are likely to have large alignments. The resulting algorithm iteratively performs a spectral decomposition of the response matrix for adaptively chosen subsets of the read pairs.
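The spectral ingredient can be illustrated in isolation. In the minimal sketch below (our own example; `rank_one_spectral` is a hypothetical name, and the real algorithm refines this estimator adaptively on bandit-chosen subsets of read pairs), the response matrix is modeled as rank-one plus noise, and the factors are recovered from the top singular pair:

```python
import numpy as np

def rank_one_spectral(R):
    """Estimate u, v with R ~= u v^T from the top singular pair of a
    noisy response matrix R (rows/columns could index reads, entries
    k-mer hash collision statistics)."""
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    # Split the leading singular value evenly between the two factors.
    return np.sqrt(s[0]) * U[:, 0], np.sqrt(s[0]) * Vt[0]
```

The rank-one product u v^T is the best rank-one approximation of R in Frobenius norm, so for small noise it closely matches the underlying model.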
Prediction strategies without loss. Rina Panigrahy, Microsoft Research Silicon Valley; Stanford University, Stanford, CA
Consider a sequence of bits where we are trying to predict the next bit from the previous bits. Assume we are allowed to say 'predict 0' or 'predict 1', and our payoff is +1 if the prediction is correct and −1 otherwise. We will say that at each point in time the loss of an algorithm is the number of wrong predictions minus the number of right predictions so far. In this paper we are interested in algorithms that have essentially zero (expected) loss over any string at any point in time and yet have small regret with respect to always predicting 0 or always predicting 1.
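The loss and regret notions above are simple to state in code. A minimal sketch (the helper names are our own, not from the paper):

```python
def running_loss(predictions, bits):
    """Loss after each step: wrong predictions minus right ones so far."""
    loss, history = 0, []
    for p, b in zip(predictions, bits):
        loss += -1 if p == b else 1
        history.append(loss)
    return history

def regret(predictions, bits):
    """Final loss relative to the better of the two constant predictors."""
    n = len(bits)
    best_constant = min(running_loss([0] * n, bits)[-1],
                        running_loss([1] * n, bits)[-1])
    return running_loss(predictions, bits)[-1] - best_constant
```

For bits = [0, 0, 1, 0], always predicting 0 attains loss −2, which is the best constant strategy, so its regret is 0. The paper seeks randomized strategies whose expected loss stays near zero on every string while this regret remains small.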
Embedding Democratic Values into Social Media AIs via Societal Objective Functions
Jia, Chenyan, Lam, Michelle S., Mai, Minh Chau, Hancock, Jeff, Bernstein, Michael S.
Can we design artificial intelligence (AI) systems that rank our social media feeds to consider democratic values such as mitigating partisan animosity as part of their objective functions? We introduce a method for translating established, vetted social scientific constructs into AI objective functions, which we term societal objective functions, and demonstrate the method with application to the political science construct of anti-democratic attitudes. Traditionally, we have lacked observable outcomes to use to train such models, however, the social sciences have developed survey instruments and qualitative codebooks for these constructs, and their precision facilitates translation into detailed prompts for large language models. We apply this method to create a democratic attitude model that estimates the extent to which a social media post promotes anti-democratic attitudes, and test this democratic attitude model across three studies. In Study 1, we first test the attitudinal and behavioral effectiveness of the intervention among US partisans (N=1,380) by manually annotating (alpha=.895) social media posts with anti-democratic attitude scores and testing several feed ranking conditions based on these scores. Removal (d=.20) and downranking feeds (d=.25) reduced participants' partisan animosity without compromising their experience and engagement. In Study 2, we scale up the manual labels by creating the democratic attitude model, finding strong agreement with manual labels (rho=.75). Finally, in Study 3, we replicate Study 1 using the democratic attitude model instead of manual labels to test its attitudinal and behavioral impact (N=558), and again find that the feed downranking using the societal objective function reduced partisan animosity (d=.25). This method presents a novel strategy to draw on social science theory and methods to mitigate societal harms in social media AIs.
Scalable Polar Code Construction for Successive Cancellation List Decoding: A Graph Neural Network-Based Approach
Liao, Yun, Hashemi, Seyyed Ali, Yang, Hengjie, Cioffi, John M.
While constructing polar codes for successive-cancellation decoding can be implemented efficiently by sorting the bit channels, finding optimal polar codes for cyclic-redundancy-check-aided successive-cancellation list (CA-SCL) decoding in an efficient and scalable manner still awaits investigation. This paper first maps a polar code to a unique heterogeneous graph called the polar-code-construction message-passing (PCCMP) graph. Next, a heterogeneous graph-neural-network-based iterative message-passing (IMP) algorithm is proposed which aims to find a PCCMP graph that corresponds to the polar code with minimum frame error rate under CA-SCL decoding. The major advantage of the IMP algorithm lies in its scalability: the model complexity is independent of the blocklength and code rate, and an IMP model trained on a short polar code can be readily applied to the construction of a long polar code. Numerical experiments show that IMP-based polar-code constructions outperform classical constructions under CA-SCL decoding. In addition, when an IMP model trained on a length-128 polar code is applied directly to the construction of polar codes with different code rates and blocklengths, simulations show that these polar-code constructions deliver comparable performance to the 5G polar codes. Yun Liao and John M. Cioffi are with the Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA (email: yunliao@stanford.edu). Seyyed Ali Hashemi is with Qualcomm Technologies, Inc., Santa Clara, CA 95051, USA (email: hashemi@qti.qualcomm.com). Hengjie Yang is with Qualcomm Technologies, Inc., San Diego, CA 92121, USA (email: hengjie.yang@ucla.edu).
Using Interpretable Machine Learning to Massively Increase the Number of Antibody-Virus Interactions Across Studies
Department of Statistics, Stanford University, Stanford, California, United States of America. *Authors contributed equally to this work. Correspondence should be addressed to teinav@fredhutch.org.
Abstract: A central challenge in every field of biology is to use existing measurements to predict the outcomes of future experiments. In this work, we consider the wealth of antibody inhibition data against variants of the influenza virus. Due to this virus's genetic diversity and evolvability, the variants examined in one study will often have little-to-no overlap with other studies, making it difficult to discern common patterns or to unify datasets for further analysis. To that end, we develop a computational framework that predicts how an antibody or serum would inhibit any variant from any other study. We use this framework to greatly expand seven influenza datasets utilizing hemagglutination inhibition, validating our method on 200,000 existing measurements and predicting 2,000,000 new values with uncertainties. With these new values, we quantify the transferability between seven vaccination and infection studies in humans and ferrets, show that serum potency is negatively correlated with breadth, and present a tool for pandemic preparedness. This data-driven approach does not require any information beyond each virus's name and measurements, and even datasets with as few as 5 viruses can be expanded, making this approach widely applicable. Future influenza studies using hemagglutination inhibition can directly utilize our curated datasets to predict newly measured antibody responses against 80 H3N2 influenza viruses from 1968-2011, whereas immunological studies utilizing other viruses or a different assay only need a single partially-overlapping dataset to extend their work. In essence, this approach enables a shift in perspective when analyzing data from "what you see is what you get" into "what anyone sees is what everyone gets."
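The expansion step can be illustrated with a generic low-rank imputation sketch (our own simplification with hypothetical names; the paper's framework differs in its model and its uncertainty estimates): treat the antibody-by-virus table as approximately low rank, keep the measured entries fixed, and iteratively refit the unmeasured ones.

```python
import numpy as np

def expand_table(M, mask, rank=1, iters=100):
    """Fill the unmeasured entries of a titer matrix by iterative
    low-rank fitting. M holds arbitrary values where mask is False;
    measured entries (mask True) are never modified."""
    X = M.copy().astype(float)
    X[~mask] = M[mask].mean()              # initialize missing entries
    for _ in range(iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        X[~mask] = low[~mask]              # refit only the missing entries
    return X
```

Any two studies sharing even a few viruses contribute overlapping observed entries to this table, which is what lets measurements transfer between datasets.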
Introduction Our understanding of how antibody-mediated immunity drives viral evolution and escape relies upon painstaking measurements of antibody binding, inhibition, or neutralization against variants of concern (Petrova and Russell, 2017). Every interaction is unique because: (1) the antibody response (serum) changes even in the absence of viral exposure and (2) for rapidly evolving viruses such as influenza, the specific variants examined in one study will often have little-to-no overlap with other studies (Figure 1).